
    A Survey on Compiler Autotuning using Machine Learning

    Since the mid-1990s, researchers have been applying machine-learning-based approaches to a range of compiler optimization problems. These techniques primarily improve the quality of the obtained results and, more importantly, make it feasible to tackle two major compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase ordering (choosing the order in which to apply them). The compiler optimization space continues to grow with the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and therefore cannot keep pace with the growing number of options. This survey summarizes and classifies recent advances in using machine learning for compiler optimization, focusing on the two major problems of (1) selecting the best optimizations and (2) the phase ordering of optimizations. It highlights the approaches taken so far, the results obtained, a fine-grained classification of the different approaches, and, finally, the influential papers of the field. (Preprint version 5.0, September 2018, of the survey accepted at ACM Computing Surveys 2018; received November 2016, revised August 2017 and February 2018, accepted March 2018.)
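
    A minimal sketch of the optimization-selection side of this problem, in the spirit of random iterative compilation: compile a program with random subsets of flags, time each binary, and keep the best. The benchmark file, the flag list, and the search budget below are illustrative assumptions, not taken from the survey.

```python
# Random iterative compilation sketch (assumes gcc and a hypothetical bench.c exist).
import random
import subprocess
import time

# Illustrative subset of real GCC flags; not a recommendation from the survey.
CANDIDATE_FLAGS = ["-funroll-loops", "-ftree-vectorize", "-finline-functions",
                   "-fomit-frame-pointer", "-floop-interchange"]

def measure(flags):
    """Compile bench.c with the given flags and return the wall-clock runtime."""
    subprocess.run(["gcc", "-O2", *flags, "bench.c", "-o", "bench"], check=True)
    start = time.perf_counter()
    subprocess.run(["./bench"], check=True)
    return time.perf_counter() - start

best_flags, best_time = [], measure([])          # plain -O2 as the baseline
for _ in range(50):                              # random search over flag subsets
    trial = [f for f in CANDIDATE_FLAGS if random.random() < 0.5]
    t = measure(trial)
    if t < best_time:
        best_flags, best_time = trial, t
print(best_flags, best_time)
```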

    Self-unloading, reusable, lunar lander project

    In the early 21st century, NASA will return to the Moon and establish a permanent base. To achieve this goal safely and economically, B&T Engineering has designed an unmanned, reusable, self-unloading lunar lander. The lander is designed to deliver 15,000 kg payloads from an orbit transfer vehicle (OTV) in a low lunar polar orbit at an altitude of 200 km to any location on the lunar surface.

    Pretenuring for Java

    Pretenuring is a technique for reducing copying costs in garbage collectors. When pretenuring, the allocator places long-lived objects into regions that the garbage collector will rarely, if ever, collect. We extend previous work on profile-driven pretenuring as follows. (1) We develop a collector-neutral approach to obtaining object lifetime profile information. We show that our collection of Java programs exhibits a very high degree of homogeneity of object lifetimes at each allocation site. This result is robust with respect to different inputs, and is similar to previous work on ML, but is in contrast to C programs, which require dynamic call-chain context information to extract homogeneous lifetimes. Call-site homogeneity considerably simplifies the implementation of pretenuring and makes it more efficient. (2) Our pretenuring advice is neutral with respect to the collector algorithm, and we use it to improve two quite different garbage collectors: a traditional generational collector and an older-first collector. The system is also novel in that it classifies and allocates objects into three categories: we allocate immortal objects into a permanent region that the collector will never consider, long-lived objects into a region in which the collector places survivors of the most recent collection, and short-lived objects into the nursery, i.e., the default region. (3) We evaluate pretenuring on Java programs. Our simulation results show that pretenuring significantly reduces collector copying for generational and older-first collectors.
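
    As a rough illustration of how profile-based pretenuring advice can be derived, the sketch below classifies allocation sites into the three categories from hypothetical lifetime profiles; the thresholds, site names, and units (bytes of allocation survived) are assumptions for illustration, not the paper's.

```python
# Toy pretenuring-advice sketch over hypothetical per-site lifetime profiles.
from statistics import mean

IMMORTAL, LONG_LIVED, SHORT_LIVED = "immortal", "long-lived", "short-lived"

def classify_site(lifetimes, heap_size, program_alloc):
    """Return a pretenuring category for one allocation site."""
    avg = mean(lifetimes)
    if avg > 0.5 * program_alloc:      # survives most of the run: never collect
        return IMMORTAL
    if avg > heap_size:                # outlives a typical collection window
        return LONG_LIVED
    return SHORT_LIVED                 # default: allocate in the nursery

# Hypothetical profile: allocation site -> sampled lifetimes (bytes of allocation).
profile = {"Node.<init>": [9e8, 8e8], "Iterator.next": [200, 150, 400]}
advice = {site: classify_site(l, heap_size=64e6, program_alloc=1e9)
          for site, l in profile.items()}
print(advice)   # e.g. {'Node.<init>': 'immortal', 'Iterator.next': 'short-lived'}
```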

    Predictive Modeling in a Polyhedral Optimization Space

    It has been shown that the right set of polyhedral optimizations can make a significant difference in the running time of a program. However, the number of alternatives for a program transformed with polyhedral optimizations can be enormous, so it is often impossible to iterate over a significant fraction of the entire space of polyhedrally transformed variants. Recent research has focused on iterating over this search space with manually constructed heuristics or with expensive search algorithms (e.g., genetic algorithms) that can eventually find good points in the polyhedral space. In this paper, we evaluate methods of modeling the polyhedral optimization problem using machine-learning techniques. We show that these techniques can quickly find excellent program variants in the polyhedral space. We introduce two different modeling techniques, which we call a "one-shot" model and a "reactive" model. Our one-shot model takes as input a single characterization of the program and outputs several different variants that are predicted to give good performance. In contrast, our reactive model takes as input the characterization of the program variant that was last predicted to give good performance. Unlike the one-shot model, the reactive model explores the space of polyhedral program variants, obtaining a characterization of each variant and using these characterizations to decide where to traverse the search space next. We show that our reactive predictor outperforms our one-shot model and can significantly outperform the most aggressive setting of state-of-the-art model-based heuristics in just a few iterations.
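
    The reactive model described above can be pictured as a model-guided walk over program variants. The sketch below is one possible reading of that loop, with hypothetical helpers (characterize, neighbors, encode, run_time) and a stand-in regressor rather than the paper's actual model: the regressor predicts a candidate's runtime from the pair (features of the current variant, encoding of the candidate).

```python
# Sketch of a "reactive" model-guided search over variants (helpers are assumed).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def reactive_search(start, characterize, neighbors, encode, run_time, steps=10):
    X, y = [], []
    model = RandomForestRegressor(n_estimators=50)
    current, best = start, (run_time(start), start)
    for _ in range(steps):
        feats = characterize(current)          # features of the variant we are at
        cands = neighbors(current)             # candidate next variants
        if len(X) >= 5:                        # enough data: let the model steer
            pred = model.predict([np.concatenate([feats, encode(c)]) for c in cands])
            current = cands[int(np.argmin(pred))]
        else:                                  # bootstrap: take an arbitrary neighbor
            current = cands[0]
        t = run_time(current)                  # measure the chosen variant
        X.append(np.concatenate([feats, encode(current)]))
        y.append(t)
        model.fit(X, y)                        # refit with the new observation
        best = min(best, (t, current), key=lambda p: p[0])
    return best                                # (best runtime, best variant)
```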

    Predictive Modeling in a Polyhedral Optimization Space

    High-level program optimizations, such as loop transformations, are critical for high performance on multi-core targets. However, complex sequences of loop transformations are often required to expose parallelism (both coarse-grain and fine-grain) and improve data locality. The polyhedral compilation framework has proved to be very effective at representing these complex sequences and restructuring compute-intensive applications, seamlessly handling perfectly and imperfectly nested loops. Nevertheless, identifying the most effective loop transformations remains a major challenge. We address the problem of selecting the best polyhedral optimizations with dedicated machine-learning models trained specifically on the target machine. We show that these models can quickly select high-performance optimizations with very limited iterative search. Our end-to-end framework is validated using numerous benchmarks on two modern multi-core platforms. We investigate a variety of machine-learning algorithms and hardware counters, and we obtain performance improvements over production compilers ranging on average from 3.2x to 8.7x, by running no more than 6 program variants from the polyhedral optimization space.
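
    The workflow described above, training a predictor offline and then compiling and running only a handful of top-ranked variants, might look roughly like the sketch below; the feature width, the SVR model, and the placeholder training data are assumptions for illustration, not the paper's setup.

```python
# Offline-trained speedup predictor; at compile time, run only the top-k variants.
import numpy as np
from sklearn.svm import SVR

# Offline: (hardware-counter features ++ transformation encoding) -> measured speedup.
X_train = np.random.rand(500, 24)      # placeholder training features (width 24)
y_train = np.random.rand(500)          # placeholder measured speedups
model = SVR().fit(X_train, y_train)

def select_variants(counters, candidate_encodings, k=6):
    """Rank candidate transformations for a new program; return the indices of the
    top k to actually compile and run. counters ++ encoding must match the training
    feature width (24 in this placeholder setup)."""
    feats = [np.concatenate([counters, enc]) for enc in candidate_encodings]
    ranked = np.argsort(-model.predict(feats))   # highest predicted speedup first
    return ranked[:k]
```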

    MiCOMP: Mitigating the Compiler Phase-Ordering Problem Using Optimization Sub-Sequences and Machine Learning

    Recent compilers offer a vast number of multilayered optimizations targeting different code segments of an application. Choosing among these optimizations can significantly impact the performance of the code being optimized. Selecting the right set of compiler optimizations for a particular code segment is a very hard problem, and finding the best ordering of these optimizations adds further complexity. Finding the best ordering is a long-standing problem in compilation research, known as the phase-ordering problem. The traditional approach of constructing compiler heuristics to solve this problem simply cannot cope with the enormous complexity of choosing the right ordering of optimizations for every code segment in an application. This article proposes an automatic optimization framework, called MiCOMP, which Mitigates the Compiler Phase-ordering problem. We perform phase ordering of the optimizations in LLVM's highest optimization level using optimization sub-sequences and machine learning. The idea is to cluster the optimization passes of LLVM's -O3 setting into different clusters and to predict the speedup of a complete sequence of optimization clusters, instead of having to deal with the ordering of more than 60 individual optimizations. The predictive model uses (1) dynamic features, (2) an encoded version of the compiler sequence, and (3) an exploration heuristic to tackle the problem. Experimental results using the LLVM compiler framework and the cBench suite show the effectiveness of the proposed clustering and encoding techniques for application-based reordering of passes, across a number of predictive models. We perform statistical analysis on the results and compare against (1) random iterative compilation, (2) standard optimization levels, and (3) two recent prediction approaches. We show that MiCOMP's iterative compilation using its sub-sequences can reach an average performance speedup of 1.31× (up to 1.51×). Additionally, we demonstrate that MiCOMP's prediction model outperforms the -O1, -O2, and -O3 optimization levels using just a few predictions and reduces the prediction error rate to only 5%. Overall, it achieves 90% of the available speedup by exploring less than 0.001% of the optimization space.
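
    A rough sketch of the encoding-and-prediction idea described above: cluster labels stand in for LLVM -O3 pass sub-sequences, each ordering is one-hot encoded together with dynamic features, and a stand-in regressor ranks candidate orderings. None of the clusters, feature sizes, or data below are MiCOMP's actual ones; they are placeholders.

```python
# Toy sketch: rank orderings of optimization sub-sequence "clusters" by predicted speedup.
import itertools
import numpy as np
from sklearn.ensemble import RandomForestRegressor

CLUSTERS = ["A", "B", "C", "D", "E"]          # stand-ins for pass clusters

def encode(sequence, dynamic_features):
    """One-hot encode a cluster ordering and append the program's dynamic features."""
    onehot = np.zeros((len(sequence), len(CLUSTERS)))
    for pos, c in enumerate(sequence):
        onehot[pos, CLUSTERS.index(c)] = 1.0
    return np.concatenate([onehot.ravel(), dynamic_features])

def best_predicted_orderings(model, dynamic_features, top_k=5):
    """Rank every permutation of the clusters by predicted speedup."""
    perms = list(itertools.permutations(CLUSTERS))
    feats = [encode(p, dynamic_features) for p in perms]
    order = np.argsort(-model.predict(feats))
    return [perms[i] for i in order[:top_k]]

# Offline training on placeholder data: encoded orderings -> "measured" speedups.
rng = np.random.default_rng(0)
X = [encode(rng.permutation(CLUSTERS).tolist(), rng.random(8)) for _ in range(200)]
y = rng.random(200)                            # placeholder speedups
model = RandomForestRegressor(n_estimators=50).fit(X, y)
print(best_predicted_orderings(model, rng.random(8)))
```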

    COBAYN: Compiler autotuning framework using Bayesian networks

    The variety of today's architectures forces programmers to spend a great deal of time porting and tuning application codes across different platforms. Compilers themselves need additional tuning, which has considerable complexity, as the standard optimization levels, usually designed for the average case and the specific target architecture, often fail to bring the best results. This article proposes COBAYN: Compiler autotuning framework using Bayesian Networks, an approach to compiler autotuning that uses machine learning to speed up application performance and to reduce the cost of the compiler optimization phases. The proposed framework is based on application characterization done dynamically using microarchitecture-independent features and Bayesian networks. The article also presents an evaluation based on static analysis and hybrid feature-collection approaches. In addition, the article compares Bayesian networks with several state-of-the-art machine-learning models. Experiments were carried out on an ARM embedded platform and the GCC compiler, considering two benchmark suites with 39 applications. The set of compiler configurations selected by the model (less than 7% of the search space) demonstrated an application performance speedup of up to 4.6× on Polybench (1.85× on average) and 3.1× on cBench (1.54× on average) with respect to standard optimization levels. Moreover, the comparison of the proposed technique with (i) random iterative compilation, (ii) machine-learning-based iterative compilation, and (iii) noniterative predictive modeling techniques shows, on average, 1.2×, 1.37×, and 1.48× speedup, respectively. Finally, the proposed method demonstrates 4× and 3× speedup, respectively, on cBench and Polybench in terms of exploration efficiency given the same quality of the solutions generated by the random iterative compilation model.
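
    The overall flow, characterize an application and then sample promising compiler configurations conditioned on that characterization, is sketched below with a deliberately simplified stand-in (k-means clusters plus per-cluster flag probabilities) in place of COBAYN's actual Bayesian network; all data is placeholder.

```python
# Simplified stand-in for "characterize, then sample promising flag configurations".
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
train_feats = rng.random((100, 10))         # microarchitecture-independent features
train_flags = rng.integers(0, 2, (100, 8))  # best-found flag settings (0/1) per app

clusters = KMeans(n_clusters=4, n_init=10).fit(train_feats)
flag_prob = np.array([train_flags[clusters.labels_ == c].mean(axis=0)
                      for c in range(4)])   # P(flag on | application cluster)

def sample_configs(app_features, n=5):
    """Sample n candidate flag vectors for a new application."""
    c = clusters.predict(app_features.reshape(1, -1))[0]
    return (rng.random((n, flag_prob.shape[1])) < flag_prob[c]).astype(int)

print(sample_configs(rng.random(10)))
```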

    Docosahexaenoic acid regulates the formation of lipid rafts: A unified view from experiment and simulation

    Docosahexaenoic acid (DHA, 22:6) is an n-3 polyunsaturated fatty acid (n-3 PUFA) that influences immunological, metabolic, and neurological responses through complex mechanisms. One structural mechanism by which DHA exerts its biological effects is through its ability to modify the physical organization of plasma membrane signaling assemblies known as sphingomyelin/cholesterol (SM/chol)-enriched lipid rafts. Here we studied how DHA acyl chains esterified in the sn-2 position of phosphatidylcholine (PC) regulate the formation of raft and non-raft domains in mixtures with SM and chol on differing size scales. Coarse-grained molecular dynamics simulations showed that 1-palmitoyl-2-docosahexaenoyl-phosphatidylcholine (PDPC) enhances segregation into domains more than the monounsaturated control, 1-palmitoyl-2-oleoyl-phosphatidylcholine (POPC). Solid-state ²H NMR and neutron scattering experiments provided direct experimental evidence that substituting PDPC for POPC increases the size of raft-like domains on the nanoscale. Confocal imaging of giant unilamellar vesicles with a non-raft fluorescent probe revealed that POPC had no influence on phase separation in the presence of SM/chol, whereas PDPC drove strong domain segregation. Finally, monolayer compression studies suggest that PDPC increases lipid-lipid immiscibility in the presence of SM/chol compared to POPC. Collectively, the data across model systems provide compelling support for the emerging model that DHA acyl chains of PC lipids tune the size of lipid rafts, which has potential implications for signaling networks that rely on the compartmentalization of proteins within and outside of rafts.

    Medical decision making for patients with Parkinson disease under Average Cost Criterion

    Parkinson's disease (PD) is one of the most common disabling neurological disorders and results in a substantial burden for patients, their families, and society as a whole in terms of increased health-resource use and poor quality of life. For all stages of PD, medication therapy is the preferred medical treatment. The failure of medical regimens to prevent disease progression and long-term side effects has led to a resurgence of interest in surgical procedures. Partially observable Markov decision processes (POMDPs) are a powerful and appropriate technique for this kind of decision making. In this paper we apply a POMDP model as a supportive tool for clinical decisions in the treatment of patients with Parkinson's disease. The aim of the model is to determine the critical threshold level at which to perform surgery in order to minimize the total cost over a patient's lifetime, where cost incorporates duration of life, quality of life, and monetary units. Under reasonable conditions reflecting the practical meaning of deterioration, and based on the various diagnostic observations, we find an optimal average-cost policy for patients with PD with three deterioration levels.
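
    The core POMDP ingredients mentioned above, belief tracking over deterioration levels and a threshold rule for recommending surgery, can be sketched as follows; the transition and observation matrices and the threshold are illustrative placeholders, not the paper's calibrated values.

```python
# Belief filtering over three deterioration levels with a threshold surgery rule.
import numpy as np

T = np.array([[0.85, 0.12, 0.03],     # P(next level | current level) under medication
              [0.00, 0.80, 0.20],
              [0.00, 0.00, 1.00]])
O = np.array([[0.75, 0.20, 0.05],     # P(diagnostic observation | true level)
              [0.15, 0.70, 0.15],
              [0.05, 0.20, 0.75]])

def update_belief(belief, obs):
    """One step of Bayesian filtering: predict with T, then correct with O."""
    predicted = belief @ T
    posterior = predicted * O[:, obs]
    return posterior / posterior.sum()

def recommend_surgery(belief, threshold=0.6):
    return belief[2] >= threshold      # operate when the worst level is likely

belief = np.array([1.0, 0.0, 0.0])     # patient starts in the mildest level
for obs in [0, 1, 1, 2, 2]:            # a hypothetical sequence of diagnoses
    belief = update_belief(belief, obs)
    print(belief.round(3), recommend_surgery(belief))
```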